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Abstract. We consider the problem of approximating a multiple-input multiple-output (MIMO) 

pxm rational transfer function H(s) of high degree by another pxm rational transfer function H{s) of 

much smaller degree, so that the 7^2 norm of the approximation error is minimized. We characterize 

the stationary points of the W2 norm of the approximation error by tangential interpolation conditions 

and also extend these results to the discrete-time case. We analyze whether it is reasonable to assume 

that lower-order models can always be approximated arbitrarily closely by imposing only first-order 

- - - interpolation conditions. Finally, we analyze the 7-{2 norm of the approximation error for a simple 

00 ' case in order to illustrate the complexity of the minimization problem. 

(_P , Key Avords. Multivariable systems, model reduction, optimal 7^2 approximation, tangential 

Cn ' interpolation. 
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^D \ 1. Introduction. In this paper, wc consider the problem of approximating a 

^^ ■ real pxm rational transfer function H{s) of McMillan degree TV by a real pxm 

rational transfer function H{s) of lower McMillan degree n using the H2-iiorm as the 
^^ ' approximation criterion. We refer, e.g., to |Che99[ lAntOSj for the relevant background 

^^ \ on linear system theory and model reduction. 

Since a transfer function has an unbounded ?i2-iiorm if it is not strictly proper, we 

will constrain both H[s) and H{s) to be strictly proper (i.e., they are zero at s = 00). 
2 ' Such transfer functions have minimal (i.e., controllable and observable) state-space 

realizations {A, B, C) e M^^^ xR^^" xRp^^ and (1, B, C) G E"^" xIR"^'" x]Rp><" 

satisfying 

>; {ylcx,^^"' H{s):^C{sI^-A)-'B, (1.1) 

O '• and 

Tij- ' J ^ ^ ^^ + ^"' 

t^ ■ \y = Cx, 

QQ ; where u G M", y, y e Rp, a; G R^, x G R". 

f^ . We also look at the equivalent formulation in the discrete-time case where the 

dynamical systems become 



^ 






H{s):^C{sIn-A)-^B, (1.2) 



Xk+i - Axk + Buk ^(^^ _ ^( _ ^^-1^ (^3^ 

Vk = Cxk 
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and 

I Xk+i ^ Axk + Bu fj-, , p^, 7n-i3 /I A\ 

< ^ ;::,^ H{z) -.^C^zln- A) B. (1.4) 

y yk = Cxk 

Expressions for the gradients of the squared 7i2-norni error function 



J(A.B,c) ■■ (A B, C) ^ \\C{sIn - A)-^B - C{sl,, - A)-^B^^^ 



■H2 



have been known since the work of Wilson |Wil70| (the expressions are recalled in 
Theorem 13. 2p . One can object, however, that the full parameterization 



{A,B,C)^H{s)=C{sIn-A)-^B (1.5) 

is not one to one, since the triple 

{At,Bt,Ct) ■■= iT-^AT,T-^B,CT) 
for any matrix T e GL(n,]R) defines the same transfer function : 
H{s) = d{slr, - A)-'B = Crisln - At)-'Bt, 
or 

H{z) = C(z/„ - A)-'B = Crizln - AtY^Bt. 

If one could eliminate the ri^ degrees of freedom of the invertible transformation 
T, one could hope to fully parameterize the target system H{s) or H[z) with only 
n(m + p) independent parameters, and to turn Wilson's conditions into n{m + p) 
nonredundant scalar conditions. Concerning the parameterization task, Byrnes and 
Falb |BF791 Th. 4.7] show that the set Rat" ^ oipxru strictly proper rational transfer 
functions of degree n can be parameterized with only n{m + p) real parameters in 
a locally smooth manner; but it is also shown there that there exists no globally 
smooth parameterization of Rat" „j if min(p, in) > 1. The task of extracting n{m-\-p) 
nonredundant conditions out of Wilson's conditions of stationarity is more delicate, 
as we shall see. 

It has been shown in |VGA08| and stated in |GAB07| that, when they have 
only first-order poles, the stationary points H{s) of the 7i2-norm error function (i.e., 
the points where the gradient of J[a,b,c) vanishes) can be characterized in diagonal 
canonical form 

H{s)^J2^^ (1.6) 

via tangential interpolation conditions which can be formulated as 

bf[H^{.s)-H^{s)]d, = 0{s + %f. 

Notice that the interpolation points are the negative of the poles of H{s). These re- 
sults are, in fact, a consequence of the relation between the equations of the gradients 
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of the 7i2-norni error (as derived originally by Wilson in [Wil70j ) and tangential in- 
terpolation based on Sylvester equations (as derived in |BGR90j . [GVV04J . [GVV05| ) . 
Similar conditions can be found in [BKVW07] for the discrete-time case. Observe 
that the diagonal canonical form (|1.6p uses the minimal number, n{m + p), of pa- 
rameters once the bi's or Ci's are normalized to remove the scaling invariance. The 
tangential interpolation conditions also impose the correct number of nonredundant 
scalar conditions (see Section HTTj) . However, in view of the result of Byrnes and Falb, 
the diagonal canonical form (jl.6p — as well as any other canonical form — cannot yield 
a globally smooth one-to-one parameterization of Rat" ,„ when min(p, m) > 1. Singu- 
larities appear when H{s) has higher-order poles. This is also true for discrete-time 
systems. 

In this paper, we characterize the stationary points H{s) or H{z) of the 7^2- 
norm error function in Jordan canonical form, i.e., without the assumption that they 
have only first-order poles. The stationarity conditions elegantly generalize to higher- 
order tangential interpolation conditions of degree h — 1 (in the sense of jGVV05| ) , 
where fc,; is the size of the ith Jordan block. The interpolation points remain the 
negative of the poles A^ of H{s), and the interpolation directions arc polynomial 
vectors of degree fc,; — 1, built from the Jordan- form equivalents of bi and q; see 
Theorem 14.81 We also show that these tangential interpolation conditions contain 
n{m + p) nonredundant scalar conditions. The result in Theorem 14.81 has several 
precursors: Aigrain and Williams |AW49| for the SISO case with simple real poles, 
Meier and Luenberger [ML67J for the general SISO case (see also the alternative 
derivation in [GAB07J ). and [VGA08| for the MIMO case with simple poles (see also 
the remark in |GAB07| ). 

Since the set of systems with higher-order poles is nowhere dense in Ratp„j, 
the generalization of the stationarity conditions to higher-order poles seems to be 
chiefly of theoretical interest. Nevertheless, we argue that the case of higher-order 
poles cannot be simply brushed aside. First, we show on an example that 7Y2-optimal 
reduced-order models with higher-order poles do occur. Second, we point out that the 
Jordan canonical form changes in a nonsmooth manner at the higher-order poles and 
that the tangential interpolation conditions for 7i2-norm stationary points become ill 
conditioned around the systems H(s) with higher-order poles. Therefore, insisting on 
the Jordan canonical form parameterization of the 7Y2-optimal reduced-order model 
may seriously affect the sensitivity of any numerical algorithm using such a parame- 
terization. When the influence of a nearby higher-order pole becomes problematic, a 
possible remedy is to exploit the full parameterization (jl.Sp . 

It should be kept in mind that the above discussion only concerns stationarity 
conditions for the 7i2-norm error function. The stationary points may be local min- 
ima, saddle points, or local maxima of the 7Y2-norm error function. When a descent 
iteration is employed, convergence to saddle points and local maxima is not expected 
to occur. However, the method can still be trapped in local, nonglobal minima. Such 
spurious local minima exist in the 7i2-optimal model reduction problem, as we show 
on a simple example. Computing an 7Y2-optimal reduced-order model is thus a tough 
(obviously nonconvex) optimization task. Nevertheless, the computed local minima 
tend to yield approximations that are considered satisfactory in practice, hence the 
interest for interpolation-based fixed-point type algorithms as revived recently in, 
e.g., [BG07l[GM7l|Gug02| . 

The paper is organized as follows. After presenting in Section [2] the necessary 
background material on the 7^2 approximation problem, in Section [3] we recall Wil- 



4 P. VAN DOOREN AND K. GALLIVAN AND P.-A. ABSIL 

son's formulas for the gradient of the TC2-norm error function. In Section IH Wil- 
son's first-order optimality conditions are expressed in a tangential interpolation form 
obtained by representing the reduced-order model in Jordan canonical form — thus 
covering the case of higher-order poles in the reduced-order model. The link to tan- 
gential interpolation by means of projection matrices that solve Sylvester equations 
is discussed in Section [3 The importance of dealing with the case of higher-order 
poles is illustrated in Section El Section [7] shows on a simple example that the H2- 
optimal model reduction problem is a difficult optimization problem, with spurious 
local minimizcrs in which local optimization algorithms may get trapped. An overview 
of algorithms for solving the 7i2-optimal approximation problem is given in Section [Sj 
The discrete-time case is covered in Section^ and conclusions are drawn in SectionfTUl 

2. The 7^2 approximation problem. Much of the material in this section is 
standard and can be found in, e.g., |Ant05j . Let E(s) be an arbitrary strictly proper 
transfer function, with realization triple (Ae, B^., C'e)- If E(s) is unstable, its ?i2-norm 
is defined to be 00. Otherwise, its squared 7i2-norm is defined as the trace of a matrix 
integral : 

\\E{s)\\i,^ ■.= ti:J^^E{ju;)"E{ju;)^^trJ^^E{ju;)EUu;f^. (2.1) 

By Parseval's identity, this can also be expressed using the state space reahzation as 

\\E{sWh, = tr / [Ce exp^=* B,][Ce exp^<=* B.fdt 

JO 
/•oo 

= tr / [Ce cxp^"* Bef[Ce cxp^=* Be]dt. 
Jo 

This can also be related to an expression involving the gramians Pe and Qe defined 
as 

/•oo /*oo 

Pe:= [exp^=*Se][exp^=*Be]^di, Qe-= [exp^=*Be]^[Ceexp^=*]di, 

Jo Jo 

which arc also known to be the solutions of the Lyapunov equations 

A,Pe + PeAl + BeBj = 0, QeA + A'^Q, + CjCe = 0. (2.2) 

Using these, it easily follows that the squared 7i2-norm of E{s) can be expressed as 

\\E{s)\\j^.^ = tr BjQeSe = tr CePeCj ■ (2.3) 

Remark 2.1. It is easy to show that if A^ has a single real eigenvalue A that 
tends to zero, i.e., A^ tends to lose its stability : 

AeX = Ax, y'^Ae = \y^ , A ^ 

then Pe and Qe tend to a rank one matrix of infinite norm, since 

Pe^xx^P/{2X), Qe^yy^i/{2X), where (3^y'^BeBjy, -f = x'^CjCjx. 

It then follows that J — > /37/(2A) also becomes infinite. Similar behavior is also 
found for complex conjugate pairs of eigenvalues tending to the imaginary axis. It 
thus follows that the squared T-i2-norm of E(s) tends to infinity as soon as Ae looses 
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and the Lyapunov equations (|2.2ll become 
\A 



Pe-^ 


p 


X 

p 


and 






Qe:= 


' Q 


Y' 

Q. 



A 



P X 

X^ P 



P X 

XT p 



r^T 



1^ 



its stability. This explains why this norm is typically defined to he infinite when E{s) 
is unstable. 

We now apply this to the error function 

E{s) := H{s) - H{s) = C{sIn - A)-^B - d{sl„ - A)-^B. 

A reahzation of E{.s) in partitioned form is given by 

{A,,B,,C,):^(\^ ^IJsl'k -C\), (2.4) 

= 0, 

(2.5) 

= 0. 

(2.6) 

In order to minimize the 7i2-distance ||i?(s) — H{s)\\'^^ of the low-order system 
H{s) = C{sln - A)-^B to a given the full-order model H{s) = C{sIm ~ A^^B, we 
must minimize the function J[a.b,c) defined by 

J{A,B.c){A B, C) = \\C{sIn - A)-'B - d{sl„ - A)-'Bf^^. (2.7a) 

We will frequently omit the subscript in J^(a,b.c){^j Bj C) when the full-order model 
is clear from the context. In view of ()2.3p . J [A, B, C) admits the formulation 



A^ 





-Q Y 
Y^ Q 


+ 


Q Y 
Y^ Q 




'A 

A 


+ 





B^ B' 



C -C 



B^ B^ 




"Q Y 

Y^ Q 




B 
B 



tr (B^QB + 2B^ YB + B^QB] , 



J{A,B,C) =tr 

(2.7b) 
where Q, Y and Q depend on A, A, C and C through the Lyapunov equation (|2.6p . 
or equivalently 

- ^T- 



J{A, B, C) = tr 



C -C 



P X 

X^ P 



= tr (CPC^ - 2CXCT + CPC^ 



(2.7c) 



where P, X and P depend on A, A, B and B through the Lyapunov equation 
Note that the terms B^QB and CPC^ in the above expressions are constant, and 
hence can be discarded in the optimization. 

Remark 2.2. The Sylvester equations (|2.5p and (j2.6p are nonsingular if and only 
if the union of the spectra of A and A does not contain any pair of opposite points 
(see llGan59l Ch. VI]). In particular, they are nonsingular if the transfer functions 
II{s) = C{sIn — A)^^ B and II{s) = C{sl„ — A)^^B are stable. In fact, the function 

{A, B,C,A,B,d)^ J(A,B,c) {A, B, C) 

is smooth around every point where II{s) and II{s) are stable. In particular, when 
II(s) is stable, the function 

{A,B,d)^J^A,B,c){AB,C) 

is smooth around every point where II{s) is stable. 
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3. Gradients of the squared ?-^2-norm error function. The expansions 
above can be used to obtain formulas for the gradients of the squared 7Y2-norni error 
function J versus A, 13, and C. We define the gradients as foUows. 

Definition 3.1. The gradients of a real-valued function f{A, B, C) of a real ma- 
trix variables A e R"^", B G R"><", C £ M^^", are the real matrices V^/(A, B, C) G 
^nxn^ \j^f{A,B,C) e M"^™, Vg/(l,i?,C) e W'\ defined by 

[V^/(AB,a)],.,=-=^/(l,B,C), i^l,...,n, j^l,...,n, 
oA^,.j 

[ygf{A,B,C)],.j=-^f{A,B,d), i = l,...,n, j = l,...,m, 
dB,j 

[Vg/(AB,a)],,,= -1-/(1,5,5), z = l,...,p, i = l,...,n. 
oCij 

We wiU write V^/ as a compact notation for V^/(A, B, C) when the argument is 
clear from the context. 

Starting from the characterizations (|2.5l2.7cp and (|2.6l2.7bp of the 7^2 norm, 
one can derive succinct forms of the gradients. This theorem is originally due to 
Wilson [Wil70j . but we state here the version derived in [VGA08| . where a proof 
based on inner products and traces is given. 

Theorem 3.2. The gradients V^^, ^ qJ and '^ qJ of the squared Ti2-norm 
error J^ (j2.7p . where both {A,B,C) and {A,B,C) are minimal (i.e., controllable and 
observable) , are given by 

V^J^2{QP + Y^X), VgJ^2{QB + Y^B), Vg J = 2((7P - CX), (3.1) 

where 

A^Y + YA-C^C^Q, P'Q + QA + d^d = Q, (3.2) 

X'^A'^ + AX'^ + BB'^ = 0, PA^ + AP + BB^ = 0. (3.3) 

The gradient forms of Theorem 13.21 allowed us to derive in [VGA08] a theorem 
that also provides an important link to tangential interpolation by projection. 

4. Stationarity conditions in Jordan form. In this section, we revisit Wil- 
son's conditions fThcorcm l3.2[l with H{s) in Jordan canonical form. We first consider 
the continuous-time case and discuss the discrete-time case in Sectional 

We will assume that both transfer functions H{s) and H{s) have real minimal 
(controllable and observable) realizations {A,B,C) and {A,B,C). 

4.1. First-order poles. We first assume that all the poles of H{s) are distinct 
(but possibly complex), which implies that the Jordan canonical form reduces to a 
diagonal form. 

Since the number of parameters in the full parameterization ()1.5|1 is not minimal, 
the gradient conditions of Theorem 13.21 must be redundant. This is made explicit in 
the theorem below, proved in |VG A08| . For this wc will need Si, tf , the (complex) 
left and right eigenvectors of the (real) matrix A corresponding to the (complex) 
eigenvalue A^. We then have : 

As.^Xs,, Cs,^:c„ tfA^X,tf, tfB^:%f, i = l,...,n, (4.1) 
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and H{s) has the partial fraction expansion 

His) = J2^, (4.2) 

where bi G C™ and q G C and where {(A^, bi,Ci):i — l,..., n} is a self-conjugate 
set. The form (|4.2p corresponds to the diagonal canonical form of H{s), a particular 
case of the Jordan canonical form when all the Jordan blocks have dimension one. It 
involves the minimal number n{m + p) of parameters once normalization conditions 
are imposed on cither the fo^'s or the q's. 

Theorem 4.1. Let H{s) = C{sIn - A)-^B and H{s) = C{sly, - A)-^B he real 
minimal realizations, and let Xi, bi, c,;, Si, and ti, i = I, . . . ,n, be as in (j4.ip . Assume 
that —Xi is not a pole of H{s), i ^ I, . . . ,n. Then 

i(Vg J)^s, = [H^{-%) - H^{-%m (4.3) 

iif (VgJ)^ = &f [i/^(-A.) - H^i-X)] (4.4) 

ltf{V;ijfs, = ^ -^ f6f(VgJ)^.,-ff(VgJ)^g,-], z ^ J, (4.6) 
z(Ai — Xj) 

where J is the squared Ti2-norm error defined in (|2.7p . 

Let S* := [si ... s„] , then the above theorem shows that the off-diagonal 
elements of S~^{'V^J)^S actually depend on (S/gJ)^ and [V qJY" . Therefore one 
need only impose conditions on diag(5^^(V^j7')"^5') and on {V gj)^ and (VgJ')-^ 
to characterize stationary points of J . The following corollary easily follows. It is 
derived independently in |iBK VW07j for the discrete-time case, and also suggested 
in |G!AB07| . 

Corollary 4.2. With the notation and assumptions of Theorem \4.1\ if (^ gJ^)'^ = 
0, (Vp J)^ = and diag(S'-i(V^J)^S') == 0, then V ^J == and the following tan- 
gential interpolation conditions are satisfied for all Xi, i = 1, . . . ,n : 

[H^{-X,) - H^{-X^)]c^ = 0, (4.7) 

bf[H^{-X,)-H^{-K)]^0, (4.8) 

if^[H^{s)-H^{s)] ?. = 0. (4.9) 

as s=-\i 

These tangential interpolation conditions contain n(rn + p) nonredundant condi- 
tions. To see this, fix i and consider first the case where Xi is real. The first two 
equations (|4.7p and (|4.8p impose that the determinant of [H^[—Xi) — H'^[—Xi)] van- 
ishes, which accounts for one real scalar condition. Next, (|4.7p and (|4.8p require that 
Ci and bi belong to the kernel of [H'^ {—Xi) — H'^{~Xi)], which imposes p — 1 and 
m — 1 real scalar conditions. Finally, the last equation (|4.9p imposes one real scalar 
condition, for a total of to 4- p conditions corresponding to the fixed i. In the com- 
plex case, we have a pair of complex-conjugate poles Xi and A^+i. The constraint 
det[iJ-^(— Ai) — H'^{—Xi)] = imposes two real scalar conditions, the first two equa- 
tions impose further 2{p— 1) and 2(to— 1) real scalar conditions, and the last equation 
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imposes two real scalar conditions, for a total of 2{m + p) real scalar conditions. The 
equations for Ai+i impose the same conditions since equations (|4.7p - (|4.9p are then 
just the complex conjugate ones as for A^. The total for Ai and A^+i is thus 2{m + p) 
real scalar conditions. To conclude, observe that i ranges from 1 to n, which yields 
a total of n{m + p) real scalar conditions. This matches the number, n{m + p), of 
independent parameters. 

The above conditions can also be expressed in terms of the Taylor expansion of 
His) -His) : 

[H^is) - H^{s)]d, = 0{s + \,), b^[H^is) - H^{s)] = 0{s + K), 

hf[H'^is)-H^{s)]Z,^Ois + %)\ 

That formulation is in fact easier to extend to higher-order poles. Observe also that 
wc retrieve the conditions of Meier and Lucnberger |ML67| for the single-input single- 
output (SISO) case since then bf and q are just nonzero scalars that can be divided 
out. The above conditions then become the 2n conditions 



i^(^)'^^-^^^i 



H{-\) = H{-\), - His)l_j^^ = - His) 



_ , i = 1,. 



When the transfer function His) has repeated first-order poles, the results are 
essentially the same except that there are bases Si and T^ of right and left invariant 
subspaces corresponding to a single eigenvalue Xi . Wc then have 

ASi ~ XiSi, CSi = Ci, Tj A = XiTi , T^ B ^ B^ , T^ Si ^ Ik- 

Theorcms l4.1l and l4.2l still hold but with the vectors Ci and h^ replaced by the matrices 
Ci and B^ . It may seem that this implies that we then impose more than nim + p) 
conditions, but in fact one can choose the individual vectors of Si and T/^ such that 
the off diagonal elements of T^^ {V ^J)^ Si are zero. Only its diagonal elements need 
then to be constrained to be zero to force the stationarity conditions. 

4.2. Higher-order poles. Let us now allow His) to have multiple and higher- 
order poles. The main result is given in Theorem 14. 81 where we show that the station- 
ary points of the 7i2-norm error function are characterized by tangential interpolation 
conditions whose degree depends on the size of the Jordan blocks of His). The result 
generalizes Corollarv l4.2l 

Let His) then have the following minimal (controllable and observable) represen- 
tation 



His) = J2 ^'(■'*)' -^'(■'*) •= ^'(^^ - ^^y'Bf, A, := 



A, -1 
A, 



-1 
A,- 



(4.10) 
where A, G C'^^^x'^-, Bf e C'^-x", C, e C^^'^-' and where {(I,;,i3f , C,) -.i = !,...,£} 
is a self-conjugate set. Notice that this is essentially the partial fraction expansion 
of His) and that there may be more than one Jordan block Ai associated with the 
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same complex eigenvalue A,;. The minimality of the representation implies linear 
independence of the leading columns in each block Bi and of the trailing rows in 
each block Ci that correspond to the same eigenvalue A^ , since these blocks appear as 
subblocks of a minimal realization of H{s). 

We will need Si, T/^, the (complex) left and right cigcnspaces of the (real) matrix 
A corresponding to the (complex) eigenvalue A^ . Because of the expansion (|4.2p , we 
then have : 



Aoi — ^i-^i, ^ Ji — '^i 



nH 



T.^A = A,Tl , 



T^B = Bf, 



T"S., = h 



(4.11) 



Note also that the matrices Si and T/^ arc not unique. When there is only one Jordan 
block associated with an eigenvalue Ai, its degree of freedom is just a block scaling 
SiDi and D~^TI^ with Di £ C'""'^'^' invertible. When there is more than one Jordan 
block associated with Ai, the degrees of freedom arc more involved, but we associate 
below right and left bases Si, Ti with each individual Jordan block Ai. 

We will also need the following lemmas in preparation for the main theorem. 
Lemma 4.3. If ^X is not an eigenvalue of A, the solution of the matrix equation 

A -1 

A ■■■ 



A^Y + YF - C^L = with F := 



eC 



kxk 



with L := [£o ^i • ■ • ^fc-i] , is given by 

Y = [(A^ + AJ)-iC^ {A^ + \I)-^C^ ... [A^ + AJ)-*^C^] 



h 



Kfc-l 



h 
4 



Moreover, let 



0a(s):=[1 (s + A) 



(s + A) 



fc-ii 



y{s):=YMs), 



then 



y{s) = {A' - sir'C L(bxis) + Ois + X)' 



which means that the ith column yi of Y is also the coefficient of (s + A)' ^ in the 
Taylor expansion of {A^ — sI)~'^C'^L(j)\{s). 

Proof The first part easily follows from (A^ + XI)yi = C'^Iq and (A^ + XI)yi = 
C'^£i-i + yi-i, i > 1. The second part follows from the identity 

oo 

(A^ - siy'c^ - ^(s + xy-^A"^ + xiy^c^ 



and from the convolution of this formal scries with the polynomial vector L(j)\{s). D 
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We also give the dual version of this lemma. 

Lemma 4.4. //—A is not an eigenvalue of A, the solution of the matrix equation 

X"A^ + FX" - R"b'^ = 

with F e ([l^xk ^g above and R := Wk~i ?'fe-2 ■ • ■ tq], is given by 



X 



H 



„H ^H 



„H 



'fe-1 






B^(A^ + A/)-i 



(s + A) 1], x" (s) := Ms)X" , 



Moreover, let 

Ms) ■■= [is + x)"-' . 

then 

x"{s) = ^Px{s)R"B^{A^ - sl)-^ + 0{s + A)'^ 



which means that the ith row x^ of X^ is also the coefficient of (s + A)'~^ in the 
Taylor expansion of ip\{s)R^ B^ (A'^ — sl)^^ . 

Proof. The proof is just the dual of the previous lemma. D 

We first obtain an expression for Vjj^J and Vgj7 that exploits the Jordan canon- 
ical form. The result generalizes formulas (|4.3p and (|4.4p to higher-order poles. 

Theorem 4.5. Let H{s) = C{sIn - A)-'^B and H{s) = C{sl„ - A)-'^B be real 
minimal realizations, and let Ai, Bi, Ci, Si, and Ti, i = 1, . . . ,£, describe the Jordan 
canonical form of H{s) as in (|4.10p and (|4.1ip . Assume that —A, is not a pole of 
H{s), i = I, . . . ,£. Define 



V'5;,(s):= [(s + A,)^-'-^ ... (s + A,) ij , 03^^(s):-[l (.s + A,) 

Then we have 
1 



(S + X^) 



ki-1 



7^(V5 J)^5,03^^ (s) = [H^{s) - H^{s)]a^j^^ (s) + Ois + A,)"n (4.12) 

(4.13) 
FT,. Then we 



-i,^^^{s)Tf{VsJ)^ = ij~,Xs)B^[H^{s) H^is)] + 0{s + %)''^ 



where J is the squared 'H2-norm error defined in (J2.7I1 . 

Proof Define Yi := YSi, Q, := -QS^, X, := -XT, and Pi 
have 



A Yi -\- YiAi — G Ci, A Qi + CjiAi — C C-i, 

X^A^ + MXf = Bf'B^, Pl'I' + A,P^ = fif 5^. 

If —\i is not an eigenvalue of A or A, both {A^ — sl)^^ and [A^ — sl)^^ have Taylor 
expansions in (s -f Ai). It then follows from Lemmas 14.31 and 14.41 that 



Y,,c 



. (s) = {A^ - sI)-^C^C,(l)^, (s) + 0{s + A,)''- , 



Ai^ 



g,0r. (s) = (A^ - .s/)-iC^a05-. (s) + 0{s + A,)'\ 



Ai^ 



(4.14) 
(4.15) 
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Vx. (s)Xf = Va. (s)Bf B^(A^ - si)-' + 0{s + %)"' , (4.16) 



V'T {s)Pf = Vt. (s)Bf B^(i^ - s/)-i + 0(,s + A,:)'^'-. (4.17) 



This then yields 
1 



i^B'^fS,4>^S^) - (B^O+B^r)5,(/)5;^(s) = [if^(.s)-F^(.s)]a03^^(s) + O(s+A,)'\ 



lv5;.(^)7;''(Vc^)^ = V'3;,(5)T^^(PC^-^^C^) = V^x^(.s)Sf [if^(,s)-i?^(,s)]+0(,s+A,)'^' 

D 

Remark 4.6. T/ie condition that — Ai is noi a pole of H{s) is satisfied when 
choosing stable interpolation points Xi, which is typically the case in the algorithms 
we discuss below. 

The following generalization of the tangential interpolation conditions (j4.7p and (|4.8p 
immediately follows from the previous theorem. 

Corollary 4.7. With the notation and assumptions of Theorem \4.5\ ifVgJJ = 
and Vpj7 = 0, then the following tangential interpolation conditions are satisfied 
for all Xi,i ~ 1, . . . ,n : 

[H^{s)-H^{s)]c,{s)=0{s + \f% Us)"[H^{s)-H^{s)] = 0{s + X,)'^% (4.18) 

where bf (s) :~ 4'\.{s)B^ andci{s) := Ci4)^{s). 

We now turn to the gradient of J versus A. We do not have expressions for 
TI^{V ^jy Sj that are clean extensions of ()4.5|) and ()4.6|) . however, we do generalize 
the two-sided tangential interpolation condition (|4.9|) that follows from V^J' ~ 0. 
This yields the following main theorem, which states the complete generalization of 
CoroUarv 14.21 to higher-order poles, i.e., the characterization of stationary points by 
means of tangential interpolation conditions. 

Theorem 4.8. With the notation and assumptions of Theorem \4-.5\ ifS7gJ ~ 0, 
^ qJ = and V^^/ = 0, then the following tangential interpolation conditions are 
satisfied for i = !,...,£: 

[H^is) - if^(s)]Q;(s) = 0{s + %)''', (4.19) 

Us)"[H^{s)~H^{s)]=0{s + X,)'^', (4.20) 

Usf[H^is)-H^{s)]Ms)^Ois + X,)''^^, (4.21) 

where bf{s) := ip-^XsYB^ andci{s) := Ci(j)^Xs). 

Proof. Conditions (|4.19p and (|4.20p were obtained in CoroUarv 14. 71 It remains to 
show that (|i?^ holds. 

We can interpret conditions ()4.19p - ()4.2ip in terms of Taylor expansions of the 
error function E{s) := H{s) — H{s). Let 

oo ki ki 

ii;(s):-^i?.(,s + AO^ cM:=J2his + Ky, 6f (s) := ^ rf (s + A,)^ 
j=a j=a j=Q 
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be the Taylor expansions around s — —Xi of the rational function E{s) and of the 
polynomials Ci{s) and hi{s)^ . Then conditions (|4.19p - (|4.2ip are respectively equiva- 
lent to 



E^ E^ 



K-i 



H 



E( 
E^ 



Iq li 

lo 



ki-1 



h 



= 0, 



(4.22) 



^,// ^/7 



^H 



and 



„H ^H 



r.H 



2fci 



1 



fci-1 






H pff 



E^ 



H 



Ei 

E^ 



EU 



E? 
E^ 



0, 



(4.23) 



,_i Eq E^ 



^2fc,-i 'o I 






E? 
E^ 



'2fci-l 



^1 
^0 



= 0. 



(4.24) 
The condition that the first ki or 2ki terms of the Taylor expansion vanish is in- 
deed equivalent to the fact that the above partial convolutions are zero. We know 
that (J4f22l) and (14231) hold, since (|4l9l) and (j4:20l) hold; it remains to show (|4:24| to 
conclude the proof. 

We will need the identity 

Ek, ■■■ -c.2fc,-l 



E? 



El 



B^iA^ + Xd) 



-ki 



S^(A^ + A,/)-'=' 



iA^ + xjr^c^ ... (A^ + xjy^^c^ 



-inT 



{A'' + xar^c 



(A^ + A,/)-'='C^ , (4.25) 



B^{A^-rxa)-\ 

which holds since 

£;f+,_, = B^{A^ + xd)-^{A^ + A,/)-^c^ - B^{^ + xdY^^ + xar^c^. 

Define 

Y, := YS,, Q, ~ ~QS,, Xf = -T^ X^ , Pf := -TfP. (4.26) 
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Using Wilson's formulas (Theorem 13. 2p for the first equality, Lemmas 14.31 and 14.41 for 
the second one, and the identity (|4.25|1 for the third, we have 



-Pi Qi - ^i Yi 



„H 






'l 
^0 



B^iA^ + XJ)-\ 



(4.27) 



(4.28) 



{A^ + Xjy'C^ ... {A^ + Xd)-'^'C^ 



Iq li 

lo 



l-ki-l 



h 

lo 



^H ^H 



r-H 









B^{A^ + Xd)-''' 
B^{A^+%I)-^ 



{A^ + xar^a 



-inT 



-k,r<T 



[A' +XJ)-'''C 



Iq li 

lo 



'fci-1 



h 

lo 



'0 ' 1 









K 



Ei 



H 



^2fe.-l 



e!^. 



lo h 
lo 



'■ki-l 



h 

lo 



(4.29) 



We are now ready to show (|4.24|) . Since ()4.22p and (|4.23|1 hold, the left-hand side 
of (|i:M)) satisfies 



' 2ki--l 



„H 



r„,H „H 



1 



E^ Sf 



^H 



fei-1 






E^ 



EZ 



Ei 



H 



E2k,-1 



E^ 



Eiki-i 



El 



lo h 
lo 



lo h 
lo 



l'2ki-l 



h 

lo 



lki-1 



h 

lo 









(4.30) 
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where the first equahty follows from a careful blockwise inspection, and the second 
equality uses (gi^. Since V^J = 0, it follows that ([4241) holds, and thus ([43T|) 
holds, n 

4.3. Number of parameters and conditions. In this subsection, we show 
that the tangential interpolation conditions obtained in Theorem 14.81 — i.e., (|4.19p - 
(|4.2ip — impose the correct number, n(rn + p), of nonredundant scalar conditions. 

To this end, fix i and consider the Jordan block of size ki associated to A^. The tan- 
gential interpolation conditions arc equivalent to (|4.22p - (|4.24p . Both ()4.22p and (I4.23P 
agree on imposing that 



E^ 



E? 
E^ 



EZ-, 



E? 
E^ 



has a kernel of dimension ki. Indeed, the fact that the realization is observable imposes 
that i!o 7^ 0, and thus the ki colunurs of 



/q ll 



h 



h 
lo 



arc linearly independent. This counts for ki conditions. Next, in (|4.22p . the equations 
in columns 1 to ki — 1 are redundant with the equations in column ki. There are 
thus kip conditions, but the left-hand matrix is known to have a kernel of dimension 
ki] this reduces the number of nontrivial conditions to kip — ki. The same reasoning 
on (|4.23|) leads to kim — ki conditions. Finally, once (|4.22p and (|4.23|) hold, the two- 
sided condition (|4.2ip . equivalent to (|4.24p . imposes ki additional conditions. This is 
because the left-hand side of (|4.24p reduces to (|4.30p . a Toeplitz matrix with only ki 
nonzero diagonals. In total for i, we have ki{m + p) nonredundant conditions. The 
overall total is thus X]i=i kilrn + p) = n{m + p), which is the dimension of Rat" „. 

5. Relation with tangential interpolation by projection. The gradient 
forms of Theorem 13 . 2 1 yields the following theorem (proved in |VG A08j ) that provides 
an important link to tangential interpolation by projection. 

Theorem 5.1. At every stationary point oj J (|2.7p where P and Q are invertible, 
we have the following identities 



A = W^AV, B = W^B, C = CV, 



W^V 



In 



(5.1) 



where W := —YQ ^ , V := XP ^ and X , Y . P and Q satisfy the Sylvester equations 

If we rewrite the above theorem as a projection problem, then we are constructing 
a projector 11 := VW'^ (implying W^V = !„) where V and W are given by the 
following (transposed) Sylvester equations 



{QW^)A + l^(giy^) + C'^C ^ 0, A{VP) + {VP)A^ + BB^ = 0. 



(5.2) 
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Note that P and Q can be interpreted as normalizations to ensure that W^V — In- 
Rewriting the Sylvester equations (|5.2p as 



(5.3a) 
(5.3b) 



W^A + {Q-^AQ)W'^ + {CQ-^)C = 0, 
AV + V{PA^P-^) + B{B'^P-^) = 0, 



shows the relation with the tangential interpolation described in |GVV05| . There it 
is shown that when solving two Sylvester equations for the unknowns W,V € M^^" 



W^A - E^W^ + L^C = 0, 
AV - VY.„ + BR^O, 



and constructing the reduced-order model (of degree n) as follows 

iA,B,d) := {{W^V)-'^W'^AV,{W'^V)-^W^B,CV), 



(5.4) 
(5.5) 



(5.6) 



amounts to a tangential interpolation problem (provided the matrix W"^V is invert- 
ible). The "interpolation conditions" (S^,i?) and (S^,i) (where S^,I]^ G M"""", 
R £ R™x" and L S M^^") are known to uniquely determine the projected system 



(A, BjC) |GVV05| . Moreover, they reproduce exactly the conditions derived in the 
previous section since they can be expressed in another coordinate system by apply- 
ing invertible transformations of the type [Q~^Y,aQ,RQ) and {P~^Yi^P,LP^ to the 
interpolation conditions. This yields transformed matrices VP and WQ but does not 
affect the transfer function of the reduced-order model (A, B, C) (see jGVV05| for 
more details). The novelty of the derivation in this paper is the case of higher-order 
poles: the tangential interpolation conditions in Theorem l4.8l contain fewer redundant 
equations than those that would follow from [GVV05| . 

6. First-order versus higher-order poles. In this section we show that 7^2- 
optimal reduced-order models with repeated poles can indeed occur and that in their 
neighborhood one can expect the tangential interpolation approach to have serious 
numerical difficulties. We start with a lemma that will allow us to demonstrate this. 

Lemma 6.1. A stable n-th degree transfer function H{s) = C[sln — A)~^B is 
a stationary point of the error function \\H{s) — H{s)\\'i-i^ if and only if H{s) can he 
realized as follows 



A Ai2 
A21 A22 



B 



B 

B2 



C 



C C2 



(6.1) 



where moreover 



AP + PA^ + BB^ ^0, A2iP + B2B^^0, 



(6.2) 



QA 



A^Q 



d^c 



0, QAi 



d^C2 



0. 



(6.3) 



Proof. The proof follows from the stationarity conditions in Theorem 13.21 The 
"if" part is direct: the stationarity conditions hold with ^ = [{^] and Y = — ^ . 
For the "only if" part, the assumption that H{s) is stable and of degree n, guarantees 



16 



P. VAN DOOREN AND K. GALLIVAN AND P.-A. ABSIL 



that the matrices P and Q exist and are invertible. Using Y^X = —PQ one can then 
always choose a coordinate system for the reaUzation of H{s) in which 



and hence 



X = 



w = xp-^ = 



p 





Y = - 



V = -YQ-^ = 



In 




Therefore we have An ^ A, Bi = B, Ci = C. D 



This special coordinate system can be used to construct a transfer function H{s) 
for which a given H{s) is the best 7^2 norm approximation of H[s). 

Theorem 6.2. Let H{s) = C{sln — A)~^B be a given stable n-th degree trans- 
fer function, then there always exists a stable N-th degree transfer function H(s) = 
C{sIn — A)^^B with N > n, for which H{s) is a stationary point of the 7^2 error 
function. 

Proof. It suffices to construct P and Q satisfying the Lyapunov equations in (|6.2p 
and (|6.3|) . and then choose A21 ~ —B2B^P^^ and A12 = —Q^^C'^C2 to satisfy the 
conditions of Lemma 16.11 Notice that this always has a solution since P and Q are 
invertible because H{s) is stable and minimal. In order to guarantee that H(s) is also 
stable, one needs to choose the remaining degrees of freedom, i.e. A22, B2 and C2 
to satisfy this condition. This can be achieved in several ways, but the simplest one 
is to choose A22 stable, and the matrices B2 and C2 sufficiently small. The matrices 
A21 = -B2B'^P~^ and A12 = -Q^^C'^C2 wiU then also be small, and A will then 
be essentially block diagonal and hence stable. D 

The above theorem does not show that the constructed stationary point is also 
a local minimum, but the following example shows that this is not too difficult to 
construct. Choose H{s) = l/(s — a)^ with a = — 1 and a realization 

l=[gi],B=[?],C=[io] 

then the transfer function H{s) = {0.25s^ - 0.5s + 9.25)/{s-^ + 7s^ + 19s + 9) with 
realization 



A 



fa 1 dl 




foi 


Q a e 


,B = 


1 


e d f 




.9. 



,C^[lOg] 



with / = —5, g = .5, d — ^ag, e — Aa^g, is stable and satisfies the stationarity 
conditions of Lemma 16.11 Moreover. 1000 random perturbations of the stationary 
point H{s) show that this is clearly a local minimum of the error function \\H — HW-h^. 

This example shows that if we aim for an 7Y2-optiinal reduced-order model H{s) 
with multiple poles, the model reduction technique that restricts itself to first-order 
poles will not be able to produce that solution. However, what happens if we perturb 
H{s) or H{s)l What can we say about the mapping from one to the other? This is 
addressed in the following theorem, which shows that if H{s) is a stationary point of 
the 7i2-distance to H(s), then every sufficiently nearby transfer function H^(s) is a 
stationary point of a nearby system H/y^{s). 

Theorem 6.3. Let H{s) = C{sl„ - A)-^B and H{s) = C{sIn - A)-^B be 
stable and minimal transfer functions such that H{s) is a stationary point (resp., 
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nondegenerate local minimum) of the error function \\H{s) — H(s)\\t-c2- Then, for 
every neighborhood lA of H{s) in Rat" j, there exists a neighborhood lA of H{s) in 
Rat ^ such that, for all Hj^{s) G U, there exists Ha{s) G U for which H/^{s) is a 
stationary point (resp., nondegenerate local minimum) of the 'H2-distance to H/^{s). 

Proof. The proof consists of constructing a continuous mapping -0 from a neigh- 
borhood V of H{s) in Rat"„ into Rat such that Ha{s) is a stationary point of 

the 7i2-distance to iP{Ha{s)) for all Ha{s) in V. We use Lemma l6.ll to do this. 
Let {Aa, Ba,Ca) be a nearby realization of the nearby system Ha{s). The solu- 
tion Pa and Qa of the perturbed Lyapunov equations in ()6.2p and (|6.3p . will be 
close to P and Q by continuity of the solution of a non-singular system of equations. 

hBlP^' 



For the same reason we can construct nearby solutions A21A = ^B2B^P^ and 



^i2A = — Qa^^a^2 to finally yield a realization 



Aa 



Aa Ai2A 

A21A A22 



Ba 



Ba 
B2 



Ca 



Ca C2 



for a transfer function Ha{s) =: iIj{Ha{s)) which is close to H{s) and satisfies the 
conditions of Lemma 16.11 Since, in view of its expression (|2.ip . the 7i2-norm error 
function is locally smooth in terms of the coefficients of system parameters of H{s) 
and H(s), every stationary point that is a nondegenerate local minimum remains a 
local minimum for sufficiently small perturbations. The proof therefore applies to 
such points. D 

This theorem implies that the set of full-order models H{s) that have 7Y2-stationary 
reduced-order models with only simple poles, is open and dense in Rat^„. This fol- 
lows from the following reasoning. From the continuity of the mapping from H{s) to 
H{s) and from the fact that the set of systems with only simple poles is open, it follows 
that, around a system H(s) with reduced-order models with only simple poles, there 
is an neighborhood of systems with reduced-order models with only simple poles. If 
H{s) has a reduced-order model H{s) with multiple poles, then, because the "reduc- 
tion" map is an open map and the set of systems with only simple poles has an empty 
interior, it follows that any neighborhood of H{s) contains a full-order model with a 
reduced-order model with only simple poles. One could conclude from this that one 
need only consider first-order interpolation techniques, but, when one approaches a 
system for which the target function H{s) has multiple poles, the interpolation condi- 
tions change in a non-smooth manner in its neighborhood. The first-order conditions 
will become linearly dependent and they will no longer define the reduced-order model 
uniquely. This is obvious in the SISO case. In the MIMO case, observe that the tan- 
gential interpolation conditions (j4.7p involve the interpolation direction q = Csi, 
where s,; is the eigenvector of A related to A^; if Xi and A^+i coalesce to form a non- 
trivial Jordan block, then the eigenvectors Si and Si+i merge (see jWil65| ) and hence 
the tangential interpolation directions merge, too. This implies that the systems of 
equations that one solves become ill-conditioned in the neighborhood of a point where 
the solution has higher-order poles. The same ill-condioned behavior can be expected 
for any target system H{s) which has no higher-order poles but is near a system with 
higher-order poles. 

7. First- and complex second-order approximation. In this section we 
consider how the error function changes with the interpolation conditions. In order to 
analyze this, we look at first- and second-order approximations only, i.e., approxima- 



18 P. VAN DOOREN AND K. GALLIVAN AND P.-A. ABSIL 

tion by systems with one real pole or two complex conjugate poles. If we are looking 
for a (real) first-order approximation 

H{s) = cb^/{s - A) 

then according to the formulas of Section [H it should satisfy the following properties 
at every stationary point of J7 : 



If we arc looking for a second-order approximation with complex conjugate poles 

H{s) = cb"/{s - A) + cb"/{s - A) 
then it should satisfy the following properties at every stationary point of J^ : 



H„ uH 



c c 



b"b „ d T,, , , b^bc^c 



In both cases, the first two equations express that for every interpolation point —A 
(real or complex) one should choose left and right singular vectors of H'^{—X) as 
tangential interpolation directions b and c for constructing the first- and second-order 
section. The third equation (combined with the two previous ones) expresses that the 
interpolation point is a stationary point of the error function versus A. 

If we keep the interpolation point as a parameter, we can plot the error function 
versus —A, but where b and c are chosen optimal for that interpolation point. In other 
words, the optimal approximation H{s) is then completely defined by the interpolation 
point —A. We can therefore have a look at the function we need to optimize by plotting 
the error function ||i?(s) — i?(s) |||^ as a function of A. It follows from the optimality 
conditions on 6 or c that \\H{s) - ff(s)||^^ = ||i/(s)||^, - ||^(s)|||^^. Indeed, let 
\/gJ = Y'^B + QB ^0 then 



\\H{s) - i?(s)||^, = tr [B' QB + B^YB + B'Y^ B + B^ QB 
= tr (b'^QB) - tr (b'^QB 

2 
H2- 



\H{s)\\l^-\\His)W^ 



The development for Vgj7 = CX — CP = is essentially the same. In the real case 
we then have 

|li?(.)||^,=6^i?^(-A)c=^ 

which implies ||iJ(s)||^ = _^~ '' because of the above formulas. This indicates 
that we need to choose the vectors b and c corresponding to the largest singular value 
of H{X). In the complex case we have 

\\H{s)rn^_ = 25R {b"H^{-X)c + b^H^{-X)c 

and the same conclusion follows after some manipulation. 
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In Figure 17.11 we show this function for a MIMO example with m = p = 2 and 
N = 20, for which the optimum is reached at a pair of complex conjugate interpolation 
points. Subplot 1 shows the poles of H{s) (blue crosses) and the poles of the H2- 
optimal reduced-order model H{s) (black circles). Subplots 2 and 3 show the log of 
the 7^2 norm of the error as a function of the interpolation point —A (both in contour 
and in 3D view). Subplot 4 shows the frequency response norms <Jma.x{G{jiu)), where 
G{s) is the system H{s), the optimal second-order approximation H{s) and the error 
H(s) — H{s). This system was generated randomly, but the function is not so simple 
to optimize. It is clearly not convex and there are several basins of attraction to local 
minima that are not optimal. One often recommends to start with the poles closest 
to the juj axis as interpolation points (or the largest peaks in the frequency response), 
but for this example that would converge to local minima, as one can see from the 
7^2 error plot. 

Fig. 7.1. Second-order approximation of MIMO case 
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8. Algorithms for solving the interpolation problem. One can view (j3.2|3.3p 
and (|5.ip as two coupled systems of equations 



(X,y,P,Q) =F(^,B,C) and (A, S, C) = G'(X, F, P, Q) 
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for which we have a fixed point {A, B, C) — G{F{A, B, C)) at every stationary point 
of J [A, B, C). This automatically suggests an iterative procedure 

{X,Y,P,Q),+,=F{A,B,d).,+i, (AB,a),+i =G(x,y,p,g)„ 

which is expected to converge to a nearby fixed point. This is essentially the idea 
behind existing algorithms using Sylvester equations in their iterations (see jAnt05| V 
Specifically, this is the idea behind the IRKA algorithm of |GAB07| . except that one 
has to adapt the formulas to make sure that the matrices V and W satisfy W^V = In- 
Another approach would be to use the gradients (or the interpolation conditions of 
Theorem 14. 1|) to develop descent methods or even Newton- like methods, as was done 
for the SISO case in |GAB07| . Quasi-Newton methods where the optimal variables are 
the interpolation points were developed in |BG07| . Such local optimization methods 
allow for local superlinear convergence to local minimizers of the error function, but 
cannot guarantee global convergence to the global minimizer. The analysis of Section^] 
also shows that using the diagonal canonical form for such algorithms may lack the 
required robustness properties. 

9. The discrete-time case. Now consider the equivalent formulation in the 
discrete-time case. We then have the dynamical systems 



Xk+i = Axk + Buk 
yk = Cxk 



and 



Xk+i = Axk + Bu 
yk = Cxk 



with transfer functions 

H{z)^C{zI- A)-^B, and H{z) = C{zl - A)-^B. 
The squared 7Y2-norm of the error function E{z) := H{z) — H{z) is then defined 



as 



/OO J po 

E{enE{en"i^ = tr^(CeA^i?e)(aA^i?, 



(9.1) 



where {Ag, Be,Ce) defined in (|2.4p is again a realization of the error transfer function 
E{z). The 7Y2-norm can now be rewritten in terms of the solutions of the Stein 
equations 



AePeAj + BeBj 



P. 



Ag QeAe 



cTa 



Qe 



(9.2) 



as 



J = tr (CePeC^) = tr (BjQeB,) 

Partition again the solutions 



P. 



P 

X^ 



X 

p 



Q 



Y 



to obtain the Stein equations in the form 
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X 1 


r A'^' 
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B^' 
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X 1 
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\ Q 


Y 1 
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\ ^^ 1 
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Y 1 
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Q . 



by 
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Theorem 9.1. The gradients V^J7, V gj and ^ qJ of J :— \\E{s)\\y^^ are given 



V^J = 2{QAP + Y^AX), VgJ:^2iQB + Y^B), VgJ ^ 2{CP ~CX), (9.3) 
where 



A^YA - C^C = Y, A^QA + C'^C 



(9.4) 



AX'^A^ + BB^ = X^, APA^ + BB^ = P. 



(9.5) 



Setting the gradient of J to zero yields the stationarity conditions derived in |BKVW07] . 
These arc the discrete-time counterpart of Wilson's conditions (see |Wil70| or Theo- 
rem [XI]). 

Again, at a stationary point (where all gradients are zero) we have that the 
projection matrices 

W-.^-YQ-'^, V := XP-^ 

satisfy A = W^AV, B = W^B, C = CV, W^V = / and the Sylvester equations 

f A'^{QW^)A + d^C^{QW^) 
\ A{VP)A:^ + BB'^ = (VP) 

indicating that we are solving a tangential interpolation problem in the inverses of 
the eigenvalues of A, and this both left and right. 

Let us now look at the tangential interpolation conditions for the discrete-time 
case. We treat immediately the higher-order case and specialize afterward to the case 
of order 1 interpolation conditions. Lemmas 14.31 and 1 4.41 have the following analogues. 

Lemma 9.2. IfX^^ is not an eigenvalue of A, the solution of the matrix equation 

A -1 

A ■■. 



A'YF -Y ^C' L with F : = 



-1 
A 



■^kxk 



with L := \£o ii 



ik-i] , is given by 



Y = 



{\A' -i)-^a 



inT 



^^'"'(AA^-/)-^-C^ 



■4 h 
4 



Kfc-l 



h 
4 



Moreover, let 



6a(z):=[1 (A-z) ... (A-z) 



fc-ii 



, y[z):=Y<i>^[z) 
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then 
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y{z) = [zA^ - I)-^C^L(l)x{z) + 0(A - z)* 



which means that the ith column yi of Y is also the coefficient of (A — z)*~^ in the 
Taylor expansion of (zA^ — I)~^C'^ L(j>\{z). 

Proof. The first part easily follows from {XA^ — I)yi = C'^£o and {XA^ — I)yi = 
C'^£i-i + A^Hi-i, i > 1. The second part follows from the identity 

oo 
1=0 

and from the convolution of this formal series with the polynomial vector L(j)\{z). D 

We give the dual version of this lemma without proof. 

Lemma 9.3. If X^^ is not an eigenvalue of A, the solution of the matrix equation 

FX"A^ ~X" = R"B^ 

with F G {^kxk ^g above and R := Wk^i rk-2 ■ ■ ■ '<'o\, is given by 

"B^A^'-'(AA^-/)-^ 



X^ = 



„H ^H 



r" 



^H 






B^A^(AA^-/)-2 
B^(AA^-/)-i 



(A-z) 1], x" [z) -.^ i^^{z)X" 



Moreover, let 

then 

x"{z) = Mz)R"B^{zA^ - I)-' + 0{X - z)^ 

which means that the ith row xf of X^ is also the coefficient of (A — z)*~^ in the 
Taylor expansion of ip\{z)R^ B^ {zA^ — ^)~^- 

This now leads to the following theorems with interpolation conditions in terms 
of the transfer function H^,{z) := z^^ H^ [z^^) : 

oo 

H,{z) := i?^(/ - zA^r^C^ = -^(A - zyB^A^'iXA' - I)-'-^C^. 

1=0 

Since the proof is essentially the same as the one for the continuous-time case, it is 
omitted here. 

Theorem 9.4. Let H{z) = ELi^»(^)' H,{z) -.^ d,{zl - A^y^Bf^ where 
{{Ai, i?f^, Ci) : i = 1, . . . ,1} is a self-conjugate set and Ai is just one Jordan block of 
size kj associated with eigenvalue Xi, and where X~ is not a pole of H{z) or H{z). 
Then with 



%{z)" :-- 



(A.-Z) 



fei-i 



(A, -z) 1 



B", 
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T 



fci-1 



have 



Ci{z) := Ci 1 {Xi~z) ... (Xi-z) 

[7jJ(z) - #f (z)]q(z) = 0{% ~ zp, (9.6) 

h.,{zf[Hj{z) - Hj{z)] = 0{K - zf^, (9.7) 



h{z)" [Hi (z) - Hi (z)]c,(z) = 0{\ - zY'^, (9.8) 

where Si , Ti are as defined in (|4.1ip . 

In the case of first-order poles, the conditions reduce to the following result, found 
in [BKVW07] in an equivalent form. 

Corollary 9.5. For the case of first- order poles (i.e. ki ~ I), the above condi- 
tions become : 

[H^{z) ~ H^izm = 0(% - z), bf[H^{z) - H^{z)] - 0{K - z), 



bf[Hi{z)-Hi{z)]Z, = 0{K-zY. 

If, moreover, m = p = I, we retrieve the 2n conditions described in the SISO result 
of \ML67I : 



H,{K) = H^X,), ^ iJ*(2)|,=5;, = ^ ^*(^) 



i = 1, . . . ,n. 



10. Conclusion. In this paper, we have characterized the stationary points of 
the 7i2-norin approximation error \\H{s) — H{s)\\'^ in the MIMO case, with the 

reduced-order system H{s) in Jordan canonical form. The stationarity conditions 
take the form of tangential interpolation conditions — whose degree depend on the 
size of the Jordan blocks — written in terms of the Jordan parameters of H{s). The 
conditions are thus implicit, which calls for iterative algorithms. However, we have 
shown that the Jordan-based approach becomes ill-conditioned in the neighborhood 
of target transfer functions H{s) with higher-order poles. It is therefore more robust 
to use the interpolation conditions in the Sylvester equation form (Theorem lS.ip since 
the 7^2 norm is smooth in the parameters {A, B, C) of these equations. We have also 
shown that the underlying optimization problem can have several local minima by 
just analyzing the approximation problem by systems of McMillan degree one (with a 
real pole) and two (with complex conjugate poles). The case of discrete-time systems 
has also been considered. 
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